Introduction
The purpose of this document is to give some insight into funding streams for selected companies and to find out whether any investors show specific ‘habits’ in the field.
Data Description
Data about funding streams is available via the Crunchbase API.
We have three datasets:
- Companies – all companies from a broad list of categories (i.e. not only ‘Artificial Intelligence’ or ‘Big Data’, but also weakly related fields), 10536 records
- Founders – all founders of the companies from the previous item, 8041 records
- Funds Raised – all data about funding rounds for the selected companies, 39100 records
Major Findings
Total Funds Raised by Year

Total raised: AI vs. Big Data

Investors Type
Only 855 records include the amount invested, for a total of over 39 bln. USD. These records are available only for organizations. For personal investors and the vast majority of organizations we can only estimate the amount invested, since we know only the funds raised by companies.

In total we have 1964 personal investors, and only 122 of them (6.2%) are among the founders of the selected companies.
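The share above can be reproduced with a simple set intersection. The sketch below is only illustrative: the function name `founder_share` and the idea of matching investors to founders by a shared identifier are assumptions, not part of the original analysis.

```python
def founder_share(personal_investors, founders):
    """Return (overlap count, share of personal investors who are also founders).

    Both arguments are iterables of person identifiers; how people are
    matched across the two datasets is an assumption of this sketch.
    """
    investors = set(personal_investors)
    overlap = investors & set(founders)
    return len(overlap), len(overlap) / len(investors)


# Arithmetic check using the counts reported in the text.
print(f"{122 / 1964:.1%}")  # 6.2%
```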

Most Active Investors
Since we can hardly estimate the amount invested by each investor precisely, it is better to analyse investor activity by the number of companies invested in. We can clearly see that some investors have a preference for Big Data and AI companies.

Let’s try to select only the investors active in Big Data and AI companies; the criterion is a share of these companies higher than 10% among the companies an investor has invested in.
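A minimal sketch of this selection is given below. It assumes the funding data can be reduced to (investor, company) pairs and that each company maps to a set of category names; the names `focused_investors`, `FOCUS` and the input layout are hypothetical and do not reflect the actual Crunchbase schema.

```python
from collections import defaultdict

# Categories treated as "in the field" (assumed labels).
FOCUS = {"Big Data", "Artificial Intelligence"}


def focused_investors(investments, company_categories, threshold=0.10):
    """Return {investor: share} for investors whose share of Big Data / AI
    companies is strictly higher than `threshold`.

    investments: iterable of (investor, company) pairs.
    company_categories: dict mapping company -> set of category names.
    """
    # Collect the distinct companies each investor has invested in.
    portfolio = defaultdict(set)
    for investor, company in investments:
        portfolio[investor].add(company)

    result = {}
    for investor, companies in portfolio.items():
        in_focus = sum(
            1 for c in companies if company_categories.get(c, set()) & FOCUS
        )
        share = in_focus / len(companies)
        if share > threshold:
            result[investor] = share
    return result
```

Counting distinct companies rather than funding rounds matches the measure used above (number of companies invested in), so repeated rounds in the same company do not inflate the share.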

Most Active Investors (Social Graph Analysis)
Social graph analysis can provide another view of investor activity in the field. The image below shows the top 20 active investors (by number of companies invested in). Investors are red, Big Data companies are blue, AI companies are green, companies in both categories are brown, and uncategorized companies are yellow.
We can clearly see that the graph is connected, i.e. companies either have common investors or are connected via several nodes. As stated above, most of the companies are uncategorized, and there are some investors with a ‘specialization’ in Big Data / AI.

If we remove all uncategorized companies, we get a disconnected graph (i.e. the investors are connected only via the ‘yellow’ companies).
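The connectivity check can be sketched as counting connected components before and after dropping the uncategorized companies. This is a minimal illustration using a breadth-first search over (investor, company) edges; the node names in the usage example are made up.

```python
from collections import defaultdict, deque


def count_components(edges):
    """Count connected components of the undirected graph given by `edges`.

    Only nodes that appear in at least one edge are counted, so an investor
    left without any company after filtering is not counted as a component.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return components


# Hypothetical toy data: two investors linked only through an uncategorized company.
edges = [("inv1", "ai_co"), ("inv1", "other_co"),
         ("inv2", "other_co"), ("inv2", "bigdata_co")]
uncategorized = {"other_co"}
filtered = [(i, c) for i, c in edges if c not in uncategorized]

print(count_components(edges))     # 1
print(count_components(filtered))  # 2
```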

In order to detect the top influencers we can employ a centrality measure (https://en.wikipedia.org/wiki/Centrality): nodes with a higher number of links incident upon them become more ‘important’.
While the first three influencers are the same as the top investors, after them we see smaller investors with a high specialization in Big Data and AI.
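The description above (number of links incident upon a node) corresponds to degree centrality. The sketch below assumes that reading and normalizes each degree by the maximum possible degree, n - 1; the exact measure and normalization used for the ranking are not stated in the source, and the edge list is a hypothetical example.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality for an undirected graph given as (a, b) edges.

    Each node's score is its number of distinct neighbours divided by
    (n - 1), where n is the number of nodes appearing in the edges.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(nbs) / (n - 1) for node, nbs in neighbours.items()}


# Hypothetical investor-company edges; rank nodes from most to least central.
scores = degree_centrality([("inv1", "c1"), ("inv1", "c2"), ("inv2", "c2")])
ranking = sorted(scores, key=scores.get, reverse=True)
```

On the investor-company graph, restricting the edges to Big Data and AI companies before scoring would favour specialized investors, which may explain why smaller investors appear in the ranking.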
